Cerebrospinal fluid biomarkers for assessing Huntington disease onset and severity

Abstract The identification of molecular biomarkers in CSF from individuals affected by Huntington disease may help improve predictions of disease onset, better define disease progression and could facilitate the evaluation of potential therapies. The primary objective of our study was to investigate novel CSF protein candidates and replicate previously reported protein biomarker changes in CSF from Huntington disease mutation carriers and healthy controls. Our secondary objective was to compare the discriminatory potential of individual protein analytes and combinations of CSF protein markers for stratifying individuals based on the severity of Huntington disease. We conducted a hypothesis-driven analysis of 26 pre-specified protein analytes in CSF from 16 manifest Huntington disease subjects, eight premanifest Huntington disease mutation carriers and eight healthy control individuals using parallel-reaction monitoring mass spectrometry. In addition to reproducing reported changes in previously investigated CSF biomarkers (NEFL, PDYN, and PENK), we also identified novel exploratory CSF proteins (C1QB, CNR1, GNAL, IDO1, IGF2, and PPP1R1B) whose levels were altered in Huntington disease mutation carriers and/or across stages of disease. Moreover, we report strong associations of select CSF proteins with clinical measures of disease severity in manifest Huntington disease subjects (C1QB, CNR1, NEFL, PDYN, PPP1R1B, and TTR) and with years to predicted disease onset in premanifest Huntington disease mutation carriers (ALB, C4B, CTSD, IGHG1, and TTR). Using receiver operating characteristic curve analysis, we identified PENK as being the most discriminant CSF protein for stratifying Huntington disease mutation carriers from controls. 
We also identified exploratory multi-marker CSF protein panels that improved discrimination of premanifest Huntington disease mutation carriers from controls (PENK, ALB and NEFL), early/mid-stage Huntington disease from premanifest mutation carriers (PPP1R1B, TTR, CHI3L1, and CTSD), and late-stage from early/mid-stage Huntington disease (CNR1, PPP1R1B, BDNF, APOE, and IGHG1) compared with individual CSF proteins. In this study, we demonstrate that combinations of CSF proteins can outperform individual markers for stratifying individuals based on Huntington disease mutation status and disease severity. Moreover, we define exploratory multi-marker CSF protein panels that, if validated, may be used to improve the accuracy of disease-onset predictions, complement existing clinical and imaging biomarkers for monitoring the severity of Huntington disease, and potentially for assessing therapeutic response in clinical trials. Additional studies with CSF collected from larger cohorts of Huntington disease mutation carriers are needed to replicate these exploratory findings.


Introduction
Huntington disease (HD) is an autosomal dominant neurodegenerative disease caused by a cytosine-adenine-guanine (CAG) expansion in the HTT gene that codes for an abnormal polyglutamine tract in the huntingtin protein (HTT). 1 Polyglutamine-expanded mutant huntingtin (mHTT), the primary pathogenic cause of HD, leads to the progressive loss of neuronal populations in the striatum, as well as other structures of the basal ganglia and the cerebral cortex. [2][3][4][5][6][7] HD typically manifests in the clinic as an adult-onset disease with affected individuals presenting with cognitive, motor and psychiatric disturbances. 8 Prior to clinical diagnosis, there is a premanifest or prodromal stage of HD when cellular dysfunction and progressive neurodegeneration are occurring in the brain but no overt symptoms are present. Age-of-onset, a time point when HD mutation carriers develop unequivocal motor signs of HD, is inversely correlated with CAG repeat length in expanded HTT, 9 enabling broad predictions of disease onset. 10 However, CAG repeat length only accounts for 50-60% of the variability in age-of-onset, 10,11 with other genetic and environmental factors reported to modify age-of-onset. [12][13][14][15] To date, there are no approved therapies to delay the onset or slow the progression of HD. Therapeutic approaches targeting the cause of HD, the CAG-expanded HTT gene and its products, or downstream processes associated with the pathogenesis of HD, are currently in clinical development. Such therapies may be most effective if intervention is initiated prior to clinical onset and significant neurodegeneration in the brain.
CSF is an accessible biofluid whose molecular composition reflects structural and functional changes in the brain, making it a promising biofluid for biomarker discovery in HD and other neurodegenerative disorders. In HD, CSF biomarkers may offer the potential to monitor cell-type and/or pathway-specific pathophysiological alterations in the CNS over the natural history of the disease. Sensitive CSF biomarkers that reflect early cellular dysfunction or neurodegeneration in the brain during the premanifest stage of HD may help improve the accuracy of disease-onset predictions and could potentially guide the appropriate timing for therapeutic intervention. Moreover, such biomarkers could be used to complement existing clinical 16,17 and imaging-based 18,19 biomarkers for monitoring disease progression and assessing the efficacy of candidate therapies in HD clinical trials.
CSF mHTT increases with disease progression 25 and its levels correlate with clinical measures of disease severity. [22][23][24][25] Importantly, a dose-dependent reduction of CSF mHTT was observed in a Phase I/IIa clinical trial evaluating a HTT-targeted antisense oligonucleotide (tominersen) delivered by intrathecal infusion, suggesting that CSF mHTT could be a valuable biomarker to assess target engagement in the CNS. 33 However, preliminary findings from the halted Phase III trial evaluating the efficacy of tominersen (NCT03761849) suggest that a reduction of CSF mHTT alone may not predict clinical benefit.
NEFL in biofluids is a biomarker of neuronal injury, with elevated NEFL levels in CSF and blood reported in HD, 24,25,[27][28][29][30][31] as well as other neurological diseases (reviewed in 34 ). In HD, NEFL levels in biofluids are correlated with clinical and imaging measures of disease 24,27 and are a strong prognostic biomarker of disease onset, progression and brain atrophy in HD patients. 25,27,29 Notably, NEFL is being used in HD clinical trials as an exploratory biomarker to monitor disease progression and to assess therapeutic efficacy. However, it remains unknown if NEFL in biofluids will respond to candidate therapies in a manner that reflects clinical benefit.
We conducted a hypothesis-driven analysis of 26 pre-specified proteins in the CSF from 16 manifest HD (manHD) patients, 8 premanifest HD (preHD) and 8 control individuals using nanoflow liquid chromatography-coupled parallel-reaction monitoring mass spectrometry (nanoLC-PRM-MS). This methodology allows for the simultaneous identification and quantification of more than 30 peptides at attomole concentrations within a single run, [35][36][37] allowing for reliable monitoring of CSF proteins with high specificity and sensitivity. An initial list of protein candidates was prioritized based on existing literature demonstrating altered levels in the CSF of HD mutation carriers, including C1QC, 38 C4B, 38 CHI3L1 (also known as YKL-40), 30,38,39 CLU, 40,41 CTSD, 38 FAT2, 41 NEFL, PDYN, 42 PENK, 41 and TTR. 38,41,43 Additional protein candidates were selected that, to our knowledge, have not been previously measured in HD CSF but were either reported to have altered expression in the striatum of HD patients [44][45][46] or animal models of HD, 47,48 or have been implicated in the pathogenesis of HD. 49,50 In this study, we sought to investigate novel CSF protein candidates and replicate previously reported molecular biomarker changes in CSF from HD mutation carriers and healthy controls. We assess potential associations between levels of CSF protein analytes, as well as correlations of candidate CSF proteins with clinical measures of disease severity. Finally, we compare the discriminatory potential of individual CSF proteins and combinations of CSF protein markers to assess their sensitivity and specificity for stratifying subjects based on HD mutation status and disease severity.

Study participants
A hypothesis-driven analysis of protein analytes was performed in CSF from 16 manHD, eight preHD and eight healthy control individuals recruited through the University of British Columbia's Centre for Huntington Disease. PreHD was defined as individuals with HTT CAG repeat expansions >36 and a Unified Huntington's Disease Rating Scale (UHDRS) diagnostic confidence level (DCL) < 3, whereas manHD was defined as individuals with a HTT CAG repeat expansion >36 and a DCL = 4. HD mutation carriers refer to both preHD and manHD individuals. Healthy control individuals with no neurological abnormalities and HTT CAG repeat lengths <36 were selected to span the range of ages of HD mutation carriers.
Clinical outcomes including total functional capacity (TFC), total motor score (TMS), verbal fluency (VF), symbol digit modality test (SDMT), and Stroop word reading (SWR) were scored by a trained neurologist using the UHDRS. 16 CAG-age product (CAP) score or disease burden score was calculated using the formula: (CAG length-35.5) × age. 51 Predicted age-of-onset estimates in preHD individuals were calculated according to the formula: 21.54 + EXP(9.556-0.146 CAG) and years to predicted disease onset was estimated by subtracting the individual's age at the time of CSF collection. 10
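As an illustration only, the CAP score and predicted-onset formulas quoted above can be written as a short Python sketch. The function and parameter names are hypothetical, and the onset formula is read here as 21.54 + exp(9.556 − 0.146 × CAG).

```python
import math


def cap_score(cag_length: float, age: float) -> float:
    """CAG-age product (disease burden) score: (CAG length - 35.5) x age."""
    return (cag_length - 35.5) * age


def predicted_age_of_onset(cag_length: float) -> float:
    """Predicted age-of-onset: 21.54 + exp(9.556 - 0.146 x CAG)."""
    return 21.54 + math.exp(9.556 - 0.146 * cag_length)


def years_to_predicted_onset(cag_length: float, age_at_collection: float) -> float:
    """Predicted age-of-onset minus the individual's age at CSF collection."""
    return predicted_age_of_onset(cag_length) - age_at_collection


# Hypothetical individual: 42 CAG repeats, 40 years old at CSF collection.
print(cap_score(42, 40))                 # 260.0
print(years_to_predicted_onset(42, 40))  # roughly 12 years
```

A negative value from `years_to_predicted_onset` would indicate that the individual is already past the predicted age-of-onset.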

CSF collection
CSF samples from Huntington disease mutation carriers and control individuals were collected at the University of British Columbia's Centre for Huntington Disease. CSF was obtained by lumbar puncture, examined qualitatively by microscopy and centrifuged to remove cells. The acellular supernatant was aliquoted and frozen at −80°C.

Study approval and patient consent
All CSF samples were collected under an approved protocol (H14-03131) in accordance with the guidelines of the institutional review board of the University of British Columbia and with the full informed consent of the subjects.

Parallel-reaction monitoring mass spectrometry
A panel of 26 proteins was measured in CSF by nanoLC-PRM-MS. For sample preparation, each CSF sample was reduced, alkylated, and trypsin digested as previously described 52,53 and cleaned using detergent removal spin columns (Thermo Fisher Scientific, catalog # 87777) as per the manufacturer's protocol. The samples were acidified with 1% formic acid (EMD Millipore) and loaded on a reversed-phase UltiMate™ 3000 RSLC-nano System with ProFlow Meter (Thermo Fisher) coupled with an Orbitrap Eclipse™ Tribrid™ mass spectrometer (Thermo Fisher) for analysis with a nano-electrospray interface operated in positive ion mode. Prior to PRM analysis, 112 peptides corresponding to 2-15 peptides per protein (Supplementary Table 1) were identified and validated using data-dependent acquisition (DDA) and split among four nanoLC-PRM-MS runs. The DDA and nanoLC-PRM-MS analysis involved injection and loading of ∼0.1-0.2 μg of the peptide sample onto a 300 µm I.D. × 0.5 mm 3 µm PepMaps® C18 trap (Thermo Fisher) followed by separation on a 100 µm I.D. × 10 cm 1.7 µm BEH130C18 nanoLC column (Waters, Milford, MA, USA). The eluted peptides were ionized by electrospray ionization for either DDA or nanoLC-PRM-MS analysis and the data for MS/MS was acquired in the Orbitrap on ions with mass-to-charge values between 375 and 1800 at a resolution of 60 000 followed by higher-energy collisional dissociation fragmentation and PRM scans. Raw data extraction and data analysis were performed using Skyline software version 3.7 (https://skyline.ms) and MatchRx software version 3.0 as previously described. 53 The extracted peptide intensities (peak areas) were normalized against a median intensity value calculated from all peptide intensities in each run.
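The per-run median normalization described above might look like the following minimal sketch. It assumes that normalization means dividing each peak area by the run median; the text does not state the exact operation, and the function name and peptide labels are hypothetical.

```python
from statistics import median


def normalize_run(peak_areas: dict[str, float]) -> dict[str, float]:
    """Scale every peptide peak area in one run by that run's median peak area.

    Assumption: 'normalized against a median' is implemented as division.
    """
    run_median = median(peak_areas.values())
    if run_median <= 0:
        raise ValueError("median peak area must be positive")
    return {peptide: area / run_median for peptide, area in peak_areas.items()}


# Hypothetical peak areas for three peptides measured in one run.
print(normalize_run({"pep_a": 2.0, "pep_b": 4.0, "pep_c": 8.0}))
# {'pep_a': 0.5, 'pep_b': 1.0, 'pep_c': 2.0}
```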

Statistical analysis
Statistical analyses were performed using GraphPad Prism 9 (GraphPad) and R statistical software, 54 using the Caret 55 and MixOmics 56 packages for modelling. Alpha values of <0.05 were considered significant for all analyses.
Comparisons of demographic characteristics and clinical measures between groups were assessed by ANOVA and Fisher's least significant difference test. Mean values ± standard deviation (SD) for each group are presented. CAP scores were compared between preHD and manHD individuals using a two-tailed t-test. Differences in gender distributions between groups were assessed using Pearson's χ² test.
Age, sex and CAG repeat length were considered potential confounding factors for comparisons of CSF protein levels between groups. The relationship of normalized CSF protein concentrations with age and sex was evaluated in control individuals using either Pearson's correlation or independent unpaired t-tests, respectively. The association of CSF protein levels with CAG repeat length in all HD mutation carriers was assessed using Pearson's correlation. Only age was found to be significantly associated with CSF protein levels and was included as a covariate for all subsequent analyses. Normalized CSF protein concentrations for all individuals were adjusted for age using linear regression.
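One way to picture the age adjustment is as a simple least-squares fit of protein level on age, keeping the residuals as the age-adjusted values. The sketch below is written in plain Python under that assumption; the function name and example numbers are hypothetical and this is not the study's actual code.

```python
def age_adjust(levels: list[float], ages: list[float]) -> list[float]:
    """Return residuals from an ordinary least-squares fit of level on age."""
    n = len(levels)
    if n != len(ages) or n < 2:
        raise ValueError("need matched level/age lists with at least two entries")
    mean_age = sum(ages) / n
    mean_level = sum(levels) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    if sxx == 0:
        raise ValueError("ages must not all be identical")
    sxy = sum((a - mean_age) * (y - mean_level) for a, y in zip(ages, levels))
    slope = sxy / sxx
    intercept = mean_level - slope * mean_age
    # Residual = observed level minus the level predicted from age alone.
    return [y - (intercept + slope * a) for a, y in zip(ages, levels)]


# Hypothetical normalized levels for four individuals.
residuals = age_adjust([1.0, 1.4, 1.5, 2.1], [30.0, 40.0, 50.0, 60.0])
print([round(r, 3) for r in residuals])
```

By construction the residuals sum to zero and are uncorrelated with age, so group differences in the adjusted values are not driven by a linear age effect.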
Pre-specified analyses comparing age-adjusted CSF protein levels between controls and all HD mutation carriers were performed using general linear models (GLMs) bootstrapped with 1000 repetitions. P-values and the percentage of events in 1000 bootstrap repetitions that the variable was selected with P < 0.05 are reported for each comparison. Odds ratios (OR) and 95% confidence intervals (CIs) for statistically significant comparisons are presented.
Comparisons across disease stages were performed by analysis of covariance (ANCOVA) including age as a covariate, and F statistics, degrees of freedom, and P-values for each comparison are reported. Post hoc tests between disease stages were performed using Tukey's test to correct for multiple comparisons and mean difference (MD) effect sizes, 95% CI and P-values for statistically significant comparisons are reported.
Associations of clinical measures of disease severity with CSF protein levels were assessed in manHD individuals using Spearman's partial rank correlation including age as a covariate, with the exception of CAP score which was evaluated in all HD mutation carriers using unadjusted data. The relationship of CSF protein levels with years to predicted disease onset was assessed in preHD subjects using Pearson's correlation on unadjusted data. Associations between each of the 26 CSF protein analytes were evaluated in all HD mutation carriers using Pearson's partial correlation including age as a covariate. Coefficient values (Spearman's ρ or Pearson's r) from ±0.50 to ±1 were considered strong correlations, ± 0.30 to ±0.49 were considered moderate correlations and ±0.10 to ±0.29 were considered weak correlations. P-values <0.05 define correlations significantly different than 0.
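The strength categories defined above can be expressed as a small helper. This is a sketch: coefficients are rounded to two decimals so that the stated bands (0.10-0.29, 0.30-0.49, 0.50-1) are contiguous, and the "negligible" label for magnitudes below 0.10 is an assumption, since the text does not name that band.

```python
def correlation_strength(coefficient: float) -> str:
    """Classify Spearman's rho or Pearson's r by absolute magnitude."""
    magnitude = round(abs(coefficient), 2)
    if magnitude > 1:
        raise ValueError("correlation coefficients lie between -1 and 1")
    if magnitude >= 0.50:
        return "strong"
    if magnitude >= 0.30:
        return "moderate"
    if magnitude >= 0.10:
        return "weak"
    return "negligible"  # assumed label; not defined in the text


print(correlation_strength(-0.74))  # strong
print(correlation_strength(0.44))   # moderate
```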
The sensitivity (% of individuals with the target condition that the test correctly identifies as positive) and specificity (% of individuals without the target condition that the test correctly identifies as negative) of each individual CSF protein for discriminating between disease groups/stages were assessed using receiver operating characteristic (ROC) curve analysis, and the corresponding area under the curve (AUC) values were computed as a measure of discriminatory performance or accuracy. CSF proteins with AUC = 0.8-1 were considered as being classifiers with high discriminatory ability, values of 0.7-0.8 as having moderate discriminatory ability, and 0.6-0.7 as classifiers with weak discriminatory ability. AUC values, AUC 95% CIs and P-values for each test are reported. AUC 95% CI was computed using the Wilson/Brown hybrid method and AUCs were compared as described by DeLong et al. 57 Sparse partial least squares discriminant analysis (sPLS-DA) is a supervised machine learning method that examines the discriminative capacity of multi-dimensional data while selecting features best able to classify samples. For each comparison, the sPLS-DA model was tuned to find the appropriate number of components and variables using 50 × 3-fold repeated cross-validation. Then, a final sPLS-DA model was fit using the optimal number of proteins for the respective optimal number of components, as determined during the tuning phase to avoid overfitting. This entire process was bootstrapped with 1000 repetitions to assess the variability and stability of the final models. ROC curves for the final sPLS-DA model were then generated and AUC values, AUC 95% CI and P-values are reported.
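The sensitivity, specificity and AUC definitions given at the start of this section can be illustrated with a minimal pure-Python sketch. Here the AUC is computed as the fraction of positive/negative pairs ranked correctly, which is equivalent to the area under the empirical ROC curve. All function names are hypothetical, higher values are assumed to indicate the target condition (for a marker that decreases in disease, such as a reduced protein, the values would be negated first), and the lower bound of each AUC band is treated as inclusive.

```python
def roc_auc(positives: list[float], negatives: list[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(positives, negatives, cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_pos = sum(1 for p in positives if p >= cutoff)
    true_neg = sum(1 for n in negatives if n < cutoff)
    return true_pos / len(positives), true_neg / len(negatives)


def discriminatory_ability(auc: float) -> str:
    """Map an AUC to the bands used in the text (lower bounds assumed inclusive)."""
    if auc >= 0.8:
        return "high"
    if auc >= 0.7:
        return "moderate"
    if auc >= 0.6:
        return "weak"
    return "below the weak band"  # assumed label; not defined in the text


# Hypothetical marker values for three cases and three controls.
cases, controls = [3.0, 4.0, 5.0], [1.0, 2.0, 3.0]
auc = roc_auc(cases, controls)
print(round(auc, 3), discriminatory_ability(auc))
print(sensitivity_specificity(cases, controls, cutoff=3.5))
```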
Multi-marker ROC curves were generated using the CombiROC analytical tool. 58 Data sets comprising age-adjusted values from up to 10 CSF proteins were uploaded into the web-based interface, and analysis was performed without further processing of the data. Test-signal cut-offs, as well as sensitivity and selectivity thresholds, were adjusted for different group comparisons. ROC curves with combinations of up to five proteins were plotted and AUC values are reported.
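To give a sense of the search space, the sketch below enumerates every panel of one to five markers drawn from ten candidates. It does not reproduce CombiROC itself; the marker names are placeholders, because the text does not list which proteins were uploaded for each comparison.

```python
from itertools import combinations


def marker_panels(markers, max_size=5):
    """Yield every combination of 1..max_size markers, smallest panels first."""
    for size in range(1, max_size + 1):
        yield from combinations(markers, size)


# Placeholder names standing in for up to 10 age-adjusted CSF proteins.
candidates = [f"protein_{i}" for i in range(1, 11)]
panels = list(marker_panels(candidates))
print(len(panels))  # 637 candidate panels (10 + 45 + 120 + 210 + 252)
```

With cohorts of this size, a search over several hundred panels can find combinations that separate groups well by chance, which is one reason such panels are treated as exploratory.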

Demographics and clinical characteristics of study participants
Study participant demographics and clinical scores are summarized in Table 1. Our study included 8 healthy controls, 8 preHD mutation carriers and 16 manHD subjects. A significant age difference between groups was observed, with manHD patients being significantly older than preHD individuals (mean age ± SD = 52.12 ± 11.94 versus 37.18 ± 9.08, P = 0.011). Healthy controls were selected to span the age range of HD mutation carriers and no significant age differences were observed compared with either preHD (P = 0.119) or manHD subjects (P = 0.398). There were no significant differences in sex distributions between groups (χ² = 0.254, P = 0.881) or CAG repeat lengths between preHD and manHD patients (mean CAG ± SD = 43.64 ± 1.51 versus 44.50 ± 2.78, P = 0.196).

Comparison of CSF protein levels across disease stages
We pre-specified 26 protein analytes to measure in CSF from controls and HD mutation carriers, including previously investigated CSF proteins, as well as exploratory candidate proteins that, to our knowledge, have never been investigated in HD CSF (Table 2).
We utilized a nanoLC-PRM-MS method to quantify unique peptides derived from each of the 26 CSF proteins with high sensitivity and specificity. For each protein, 2-15 unique peptides were measured in parallel. A complete list of peptide sequences measured by nanoLC-PRM-MS is presented in Supplementary Table 1. We observed moderate to strong positive correlations between normalized unadjusted values for each peptide from all protein candidates assessed, suggesting a reliable measurement of these proteins in CSF (Supplementary Table 1). Mean normalized peptide concentrations for each CSF protein were then adjusted to control for the effects of age, and residuals were used for subsequent analyses.
We next investigated whether CSF protein levels were altered across stages of the disease. HD mutation carriers were divided based on DCL into preHD (DCL < 3) and manHD groups (DCL = 4), and the manHD group was further stratified based on TFC score into early/mid HD (TFC > 5) and late HD (TFC < 5) groups. A comparison of age-adjusted CSF protein levels was performed between controls, preHD, early/mid HD and late HD groups by ANCOVA followed by post hoc analysis using Tukey's test to correct for multiple comparisons (Supplementary Table 2). We identified eight CSF proteins that were significantly altered across disease stages (Fig. 1).
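The staging rule described above can be summarized in a short sketch. The function name is hypothetical, and the code follows the thresholds exactly as stated, so a DCL of 3 or a TFC of exactly 5 is returned as unassigned rather than guessed.

```python
from typing import Optional


def disease_stage(dcl: int, tfc: Optional[float] = None) -> Optional[str]:
    """Assign an HD mutation carrier to a stage from DCL and, if manifest, TFC.

    Returns None when the stated criteria do not cover the input
    (DCL = 3, TFC = 5, or a manifest individual without a TFC score).
    """
    if dcl < 3:
        return "preHD"
    if dcl == 4 and tfc is not None:
        if tfc > 5:
            return "early/mid HD"
        if tfc < 5:
            return "late HD"
    return None


print(disease_stage(2))         # preHD
print(disease_stage(4, tfc=7))  # early/mid HD
print(disease_stage(4, tfc=3))  # late HD
```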

Correlations of CSF protein levels with clinical measures of disease severity
Correlations of CSF protein levels with CAP score, an age-dependent measure of cumulative exposure to CAG-expanded HTT, were performed on unadjusted values using Spearman's rank correlation in all HD mutation carriers (Table 3). The relationship of CSF protein levels with clinical measures of disease severity in manHD individuals was evaluated using Spearman's partial rank correlation including age as a covariate (Table 3). We also assessed the relationship of unadjusted CSF protein levels with years to predicted disease onset 10 in preHD mutation carriers (mean years to predicted onset ± SD = 8.41 ± 8.04) using Pearson's correlation and found that ALB (r = 0.75, P = 0.031), C4B (r = −0.74, P = 0.036), CTSD (r = 0.66, P = 0.08), IGHG1 (r = 0.85, P = 0.008) and TTR (r = 0.86, P = 0.006) showed strong associations with these estimates (Supplementary Table 3).

Correlations between CSF protein analytes in Huntington disease mutation carriers
The relationship between individual CSF protein analytes in HD mutation carriers was assessed using Pearson's partial correlation (Supplementary Table 4). Functional enrichment analysis was performed using all 26 CSF proteins to identify overlap in biological processes related to the pathophysiology of Huntington disease. 59 We observed moderate to strong correlations between levels of CSF proteins involved in neuronal function, motor behaviour, cognition and memory, synapse organization and plasticity, apoptosis/cell death, as well as immune and complement pathway activation.

Discriminatory potential of CSF protein markers for Huntington disease
We next used ROC curve analysis to evaluate the sensitivity and specificity of each CSF protein for discriminating between either HD mutation carriers and controls, preHD and controls, or manHD and preHD. For each test, AUC values were computed as a measure of discriminatory performance for each CSF protein (Supplementary Table 5). PENK showed the strongest discriminatory ability of any CSF protein for distinguishing between HD mutation carriers and controls (Fig. 2A, AUC = 0.94, 95% CI: 0.86-1.00, P = 0.0003), accurately classifying 79.2% of HD mutation carriers and 100% of control individuals. PENK was also the most discriminant CSF protein for distinguishing preHD from controls (Fig. 2B, AUC = 0.92, 95% CI: 0.78-1.00, P = 0.005), correctly classifying 75% of preHD and 100% of control individuals. CHI3L1 showed moderate discriminatory power for distinguishing between manHD and preHD individuals (Fig. 2C, AUC = 0.70, 95% CI: 0.42-0.98, P = 0.111), accurately classifying 93.8% of manHD but only 62.5% of preHD individuals.
CSF NEFL was previously shown to have high accuracy for stratifying HD mutation carriers from controls and manHD from preHD individuals. 24 We observed that NEFL showed a strong discriminatory ability for distinguishing between HD mutation carriers and controls (Supplementary Table 5, AUC = 0.81, 95% CI: 0.62-1.00, P = 0.009), but relatively weak discrimination of manHD from preHD subjects in our cohort (Fig. 2C, AUC = 0.69, 95% CI: 0.44-0.94, P = 0.142). By comparison, PENK showed a superior discriminatory ability to NEFL for distinguishing between HD mutation carriers and controls, but this did not reach statistical significance (Supplementary Fig. 1, P = 0.121).
We next performed sPLS-DA to evaluate the discriminatory potential of combining all 26 CSF proteins. The relative discriminatory importance of individual CSF proteins to each sPLS-DA model is presented in Supplementary Fig. 2.

Multi-marker CSF protein panels for stratifying subjects based on Huntington disease mutation status and disease severity
We next performed a combinatorial ROC curve analysis using the CombiROC analytical tool 58 to identify marker combinations, comprising the fewest number of CSF proteins (up to five), that could provide the highest discriminatory ability for stratifying individuals based on HD mutation status and disease severity. Each of the most discriminant multi-marker combinations is presented in Supplementary Fig. 3.
The combination of PENK, IGHG1, and GNAL was able to accurately classify 88% of HD mutation carriers and 100% of control individuals, and improved discriminatory performance (Fig. 3A, AUC = 0.98) beyond what was observed for any individual protein (Fig. 2A, PENK AUC = 0.94) or the combination of all 26 CSF proteins by sPLS-DA (Fig. 2F, AUC = 0.90).
We identified eight unique combinations of three CSF proteins that showed perfect classification of preHD and controls, including combination 3A: PENK, ALB and NEFL (Fig. 3B, AUC = 1). These three-marker panels showed superior discriminatory performance compared with PENK alone (Fig. 2B, AUC = 0.92) and the combination of all proteins (Fig. 2I, AUC = 0.88).

Discussion
In this study, we utilized nanoLC-PRM-MS to quantify levels of 26 proteins in CSF from HD mutation carriers and healthy control individuals. Our primary objective was to replicate previously reported changes in CSF protein markers and to investigate whether novel candidate CSF proteins were altered in HD. Consistent with previous reports, we observed that NEFL, 24,25,27-31 PENK, 41 PDYN 42 and CTSD 38 were significantly altered in the CSF of HD mutation carriers compared with controls after adjustment for age.
Multiple studies have reproducibly shown increased levels of blood and CSF NEFL in HD. 24,25,[27][28][29][30][31] Elevated levels of NEFL in biofluids have also been reported in other neurological diseases (reviewed in 34 ) highlighting its utility as a biomarker of neuronal injury, but one that is not specific to HD. NEFL is currently being used in HD clinical trials as an exploratory biomarker of disease progression and to assess therapeutic efficacy.
We found NEFL levels to be significantly increased in the CSF of late HD subjects compared with control individuals (P = 0.002), and trends towards elevated NEFL in early/mid HD compared with controls (P = 0.073) and late HD compared with preHD (P = 0.068). We did not observe a significant increase of CSF NEFL in manHD compared with preHD, as reported previously using immunoassays to measure NEFL. 24,27,30 We did, however, measure significant associations of CSF NEFL with CAP (ρ = 0.44) and SWR scores (ρ = −0.50). 24,30 Our findings support the continued use of NEFL as an exploratory biomarker for monitoring disease severity in clinical trials for HD.
PENK and PDYN are highly expressed in distinct striatal medium spiny neuron (MSN) populations 2 and are downregulated in the caudate of post-mortem HD brains. 44 Both PENK and PDYN precursor proteins are cleaved to generate secreted peptides that modulate neurotransmission and regulate various neural functions in the brain. PENK levels in CSF were reported to be decreased in manHD compared with preHD and healthy controls using LC-MS. 41 We measured a significant reduction of PENK in preHD (P = 0.004), early/mid HD (P = 0.012) and late HD (P = 0.0002) compared with controls and observed moderate correlations with CAP score (ρ = −0.48) and SDMT (ρ = 0.42).
Reduced CSF PDYN was recently reported in manHD patients compared with controls using targeted LC-MS. 42 This study found that levels of PDYN were not decreased in other neurodegenerative diseases, including Alzheimer's disease, Parkinson's disease, and amyotrophic lateral sclerosis, suggesting that changes of CSF PDYN may be unique to HD. 42 We found PDYN to be significantly reduced in preHD (P = 0.004), early/mid HD (P = 0.012) and late HD (P = 0.0003) compared with controls. PDYN also showed strong associations with TFC (ρ = 0.53), VF (ρ = 0.55) and SDMT (ρ = 0.57) in manHD individuals. We postulate that reduced CSF PENK and PDYN in preHD individuals may reflect early functional disturbances in the health of MSNs prior to disease onset and differential loss of specific MSN sub-populations at more advanced stages of HD.
CTSD is a lysosomal protease expressed in the brain that has been shown to promote the degradation of mHTT in primary neurons. 60 Levels of CTSD in the CSF were reported in one study to be significantly decreased in HD mutation carriers by MS 38 and in another to be unchanged between manHD, preHD and controls using PRM-MS. 61 Consistent with these reports, we found CTSD to be significantly reduced in the CSF of HD mutation carriers compared with controls (P = 0.044) but not significantly changed across disease stages.
In contrast to previous reports, we did not detect significant changes in C1QC, 38 C4B, 38 CHI3L1, 30,38,39 CLU, 40,41 FAT2 41 or TTR 38,41,43 protein levels in the CSF of HD mutation carriers compared with controls. These discordant findings could be due to differences in patient demographics and clinical characteristics, methodology used for the detection of protein analytes in CSF, and/or the specific peptides that were selected for analysis by PRM-MS in our study.
BDNF is a growth factor required for the survival of various neuronal populations in the CNS and is downregulated in the caudate and putamen of post-mortem HD brains compared with age-matched controls. 62 Levels of BDNF in the CSF were previously reported to be unchanged across HD stages using an immunoassay. 63 We observed a strong trend towards a reduction of BDNF in late HD compared with controls (P = 0.053) and moderate correlations with TFC (ρ = 0.45) and SDMT (ρ = 0.48). These findings suggest that reduced CSF BDNF may reflect the depletion of BDNF production/release 64 or even the loss of cortical neurons at advanced stages of HD. 5 Additional studies to investigate CSF BDNF as a potential biomarker for HD may be warranted.
CSF to blood ALB 65,66 and IgG quotients, 65 routinely used to measure blood-brain barrier (BBB)/blood-CSF barrier (BCSFB) dysfunction and intrathecal IgG production, were previously found to be unchanged in the CSF of HD mutation carriers compared with controls. Our data showed a strong trend towards increased ALB in HD mutation carriers compared with controls (P = 0.068) and a significant increase of CSF IGHG1 (heavy chain constant domain of IgG) in late HD compared with controls (P = 0.004). The increased CSF albumin and IGHG1 could reflect neurovascular abnormalities and BBB/BCSFB dysfunction which have been reported in HD. [67][68][69] Moreover, elevated CSF IGHG1 at advanced stages of HD may suggest increased local CNS IgG synthesis, a marker of CNS inflammation.
In addition to reproducing reported changes in previously investigated CSF biomarkers, we also identified novel candidate CSF proteins whose levels were altered in HD CSF. GNAL, which is highly expressed in the basal ganglia, plays an important role in MSN dopamine signalling. 45,70 Reduced levels of GNAL have been reported in the caudate and putamen of HD patients. 44,45 We found GNAL to be significantly elevated in the CSF of HD mutation carriers compared with controls (P = 0.043), which could reflect an increased release from degenerating striatal MSNs. IGF2 is a regulator of neurogenesis, synaptic formation and spine maturation in the brain that plays a role in learning and memory functions. [71][72][73] Importantly, reduced IGF2 levels have been reported in the striatum and plasma from HD patients. 50 We detected significantly elevated IGF2 levels in late HD compared with controls (P = 0.012), and moderate correlations with TFC (ρ = −0.40), TMS (ρ = 0.40), and SWR (ρ = −0.42) in manHD individuals. The unexpected increase of CSF IGF2 in HD mutation carriers is consistent with reports from Alzheimer's disease. 74,75 We postulate that elevated CSF IGF2 may reflect increased release from IGF2-producing cells (e.g. neural stem cells 73 ) or potentially a compensatory neuroprotective mechanism in the brain.
CNR1 is highly expressed in the basal ganglia where it modulates synaptic functions involved in motor behaviour. 76 Early downregulation of CNR1 has been reported in the striatum of HD patients. 4,44 Levels of CNR1 trended lower in late HD compared with controls (P = 0.055) and preHD (P = 0.108) and were significantly reduced in CSF from late HD compared with early/mid HD (P = 0.008). Moreover, CSF CNR1 levels were strongly correlated with TFC (ρ = 0.58), VF (ρ = 0.52) and SDMT (ρ = 0.56) in manHD. Reduced CSF CNR1 could be a marker that reflects the loss of CNR1-expressing neurons in the basal ganglia at advanced stages of HD.
C1Q (composed of A, B and C polypeptide chains), a component of the complement C1 recognition complex of the classical pathway, is released from CNS cells in response to inflammatory stimuli in neurodegenerative diseases (reviewed in 77 ). Upregulation of early complement activators and regulators from reactive microglia has been reported in the striatum of HD patients. 78 We found CSF C1QB to be modestly increased in early/mid HD compared with preHD and significantly reduced in late HD compared with early/mid HD (P = 0.008) and controls (P = 0.010). Surprisingly, we did not find C1QC to be significantly altered, although similar trends in levels were observed. C1QB also showed a strong association with SDMT (ρ = 0.51) in manHD individuals. These findings suggest that CSF C1QB levels may reflect early HD-associated complement activation in the brain and potential dysregulation of this pathway at more advanced stages of disease. IDO1, a rate-limiting enzyme in the kynurenine pathway, was reported to be upregulated and have increased activity in the striatum of an HD mouse model. 49 IDO1 levels were significantly decreased in late HD compared with controls (P = 0.020), and showed moderate to strong correlations with CAP score (ρ = −0.45), TFC (ρ = 0.43) and VF (ρ = 0.50). The reduction of IDO1 in the CSF could suggest dysregulation of the kynurenine pathway in the brain or it may be a marker of cell loss in the striatum in late HD.
Together our data suggest that GNAL, IGF2, CNR1, C1QB and IDO1 may represent promising CSF biomarker candidates that reflect distinct HD-associated pathophysiological alterations in the CNS.
A secondary objective of our study was to compare the discriminatory potential of individual CSF markers and combinations of CSF markers for distinguishing individuals based on HD mutation status and disease severity. We identified PENK and PDYN as being the most discriminant individual CSF proteins for distinguishing HD mutation carriers from controls. Notably, PENK (AUC = 0.94) and PDYN (AUC = 0.84) each showed superior discrimination of HD mutation carriers from controls compared with NEFL alone (AUC = 0.81). Moreover, PENK (AUC = 0.92) also showed the highest discriminatory power for distinguishing preHD from controls. No individual CSF protein showed high discriminatory accuracy for distinguishing between preHD and manHD individuals in our cohort, with CHI3L1 (AUC = 0.70) showing only moderate discriminatory power.
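For readers less familiar with the AUC values quoted above, the following minimal sketch shows how an AUC can be computed for a single marker. It uses the standard rank interpretation: the AUC is the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one half. The function names and numbers are illustrative, not study data, and this is not the software used for the analysis.

```python
from typing import Sequence


def auc_from_scores(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC as P(case value > control value), counting ties as 0.5."""
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def direction_free_auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Report discrimination regardless of whether the marker rises or falls in cases.

    This is a presentation choice made for this sketch: a marker that is
    lower in cases yields AUC < 0.5, which is reported here as 1 - AUC.
    """
    a = auc_from_scores(cases, controls)
    return max(a, 1.0 - a)


# Made-up relative abundances for a marker that is lower in carriers.
example_carriers = [0.4, 0.5, 0.6, 0.7, 0.9]
example_controls = [0.8, 1.0, 1.1, 1.2]
example_auc = direction_free_auc(example_carriers, example_controls)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to complete separation of the two groups. With group sizes as small as those in this cohort, a single reordered pair of subjects shifts the estimate noticeably.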
sPLS-DA models incorporating all 26 CSF markers to separate HD mutation carriers from controls (AUC = 0.90) or preHD from controls (AUC = 0.88) showed discriminatory performances similar to PENK alone. In contrast, the combination of all CSF proteins improved the stratification of manHD from preHD (AUC = 0.95) compared with CHI3L1 (AUC = 0.70) or any other individual protein, highlighting the increased discriminatory value of combining multiple CSF markers.
We also performed a combinatorial ROC curve analysis and identified exploratory multi-marker CSF panels with up to five proteins that, in all instances, showed superior discriminatory performance compared with individual proteins for distinguishing individuals based on HD mutation status and disease severity.
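The general idea of a combinatorial search over marker panels can be sketched as follows. This is an illustration only and not the procedure used in this study: it assumes each panel is scored by the sum of direction-aligned z-scores, which is one of several possible combination rules. The marker names and toy values are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in cases for y in controls)
    return wins / (len(cases) * len(controls))


def split(scores: Sequence[float], labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Separate scores into cases (label 1) and controls (label 0)."""
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    controls = [s for s, lab in zip(scores, labels) if lab == 0]
    return cases, controls


def zscores(values: Sequence[float]) -> List[float]:
    """Standardise one marker so that markers on different scales can be summed."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]


def rank_panels(data: Dict[str, Sequence[float]], labels: Sequence[int],
                max_size: int = 5) -> List[Tuple[float, Tuple[str, ...]]]:
    """Score every panel of up to max_size markers and sort by AUC (best first)."""
    oriented = {}
    for name, values in data.items():
        z = zscores(values)
        # Flip markers that are lower in cases so that higher always means "case".
        sign = 1.0 if auc(*split(z, labels)) >= 0.5 else -1.0
        oriented[name] = [sign * v for v in z]
    names = sorted(oriented)
    results = []
    for size in range(1, min(max_size, len(names)) + 1):
        for panel in combinations(names, size):
            composite = [sum(oriented[m][i] for m in panel) for i in range(len(labels))]
            results.append((auc(*split(composite, labels)), panel))
    # Highest AUC first; ties favour smaller panels, then alphabetical order.
    results.sort(key=lambda r: (-r[0], len(r[1]), r[1]))
    return results


# Toy data: neither hypothetical marker separates the groups alone, but their sum does.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_data = {"marker_A": [3, 0, 2, 1, 1, 0], "marker_B": [0, 3, 2, 1, 0, 1]}
ranked = rank_panels(toy_data, toy_labels, max_size=2)
```

Note that this sketch chooses marker directions and evaluates panels on the same subjects, so the resulting AUCs are optimistic. With 26 markers and panels of up to five proteins, more than 80,000 combinations are compared, which makes replication in an independent cohort essential before any panel is interpreted.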
The combination of PENK, NEFL and ALB showed perfect discrimination between preHD and control individuals in our cohort, suggesting that changes in these CSF proteins represent early events in disease pathogenesis, prior to overt symptomatic onset. Furthermore, all eight leading three-marker combinations included PENK, highlighting the importance of this protein for distinguishing between preHD and controls.
The panel consisting of CHI3L1, C4B, IGHG1, and ALB showed high discriminatory power (AUC = 0.91) for distinguishing preHD from manHD individuals, with sensitivity and specificity superior to CHI3L1 alone (AUC = 0.70) and similar to that observed with all 26 CSF markers by sPLS-DA (AUC = 0.95). These data highlight the additive classification performance that is possible even when combining markers that individually have a weak or moderate discriminatory ability.
Moreover, we identified 14 unique four-marker CSF protein panels that showed perfect discrimination of early/mid HD from preHD individuals, including combinations of C4B, TTR, ALB, and CYCS, as well as PPP1R1B, C4B, TTR and CTSD. Notably, ALB (r = 0.75), C4B (r = −0.74), CTSD (r = 0.66) and TTR (r = 0.86) were strongly correlated with years to predicted disease onset in preHD individuals. 10 We postulate that panels of CSF markers could be used in conjunction with CAG repeat length to improve the accuracy of disease-onset estimates.
Multi-marker CSF protein panels that can accurately discriminate between preHD and early/mid HD or manHD individuals could help define the optimal timing of therapeutic intervention. Preclinical studies in animal models of HD would be needed to assess whether any of the exploratory CSF protein combinations respond to candidate therapies in a manner that suggests therapeutic benefit. If validated, such biomarker panels could provide objective exploratory pharmacodynamic measures for monitoring therapeutic response in preventative trials for HD.
Finally, we identified multiple CSF marker panels, including the combination of CNR1, PPP1R1B, BDNF, APOE and IGHG1, that showed perfect classification of late HD and early/mid HD individuals. This combination of CSF proteins likely reflects alterations in neuronal health, neurotrophic support, lipid metabolism, neuroinflammation, and BBB/BCSFB integrity in the CNS which have been linked with disease severity. Given the complex pathogenesis of HD and associated alterations of numerous biological pathways over the natural history of the disease, it is likely that combinations of molecular biomarkers assessing multiple processes related to HD pathophysiology in parallel will be favoured for use in clinical trials to complement existing clinical and imaging biomarkers. Such panels could provide additional cell-type or pathway-specific resolution into HD-associated pathophysiological changes compared with a single biomarker, such as NEFL, which likely reflects general axonal damage/neuronal injury in the CNS.
MS-based methods are capable of sensitive detection of proteins in biofluids, comparable to other analytical assays, but may provide superior specificity through the identification of multiple specific peptide sequences for any individual protein. 79,80 Furthermore, targeted MS methods have high multiplexing capability (>100 peptides per assay) which is difficult to achieve with conventional assays (e.g. immunoassays). 81 Although MS-based assays may not be practical or cost-effective for routine clinical use, the exploratory multi-marker CSF protein panels identified in this study could be used to help guide the design of multiplex immunoassays that would be more amenable to clinical practice. To enable this translation, longitudinal observational studies would be needed to quantify absolute concentrations of individual CSF proteins and define biomarker signatures in HD mutation carriers over the natural history of the disease. We also identified multiple unique CSF protein combinations that differ in composition but show equivalent discriminatory performance for stratifying subjects based on disease severity. This may provide additional flexibility for assay development and could help the validation of such assays for clinical use.
The major limitation of our study was the relatively small number of CSF samples analysed. Replication studies with CSF collected from larger cohorts of HD mutation carriers (e.g. HDClarity, NCT02855476) would be needed to validate our exploratory findings. Furthermore, studies to assess whether any of the candidate CSF proteins studied here are also altered in other biofluids (e.g. blood) may be warranted.
In this study, we show compelling evidence to suggest that combinations of CSF markers can outperform individual markers for stratifying individuals based on HD mutation status and disease severity. We postulate that the multi-marker CSF protein panels defined herein may be useful for improving the accuracy of age-of-onset estimates for HD and complement clinical biomarkers for monitoring disease severity.

Competing interests
J.L.M. is currently a consultant for Sanofi and Voyager Therapeutics. B.R.L. reports roles as a scientific consultant with sRNAlytics, Teva, Roche/Genentech, Takeda, Triplet, Ionis, Novartis, Spark, Sintetica, LifeEdit, Design, Remix Therapeutics, and PTC Therapeutics. B.R.L. is a co-founder and CEO of Incisive Genetics Inc. M.R.H. serves on the public boards of Ionis Pharmaceuticals, Oxford Biomedica, AbCellera and 89bio. The other authors declare no competing interests.

Supplementary material
Supplementary material is available at Brain Communications online.

Data availability
Data from this study can be made available upon reasonable request.